Fourier spectral estimates and, to a lesser extent, the autocorrelation function are the primary tools to detect periodicities in experimental data in the physical and biological sciences. We propose a new method which is more reliable than
traditional techniques, and is able to make clear identifica- 
tion of periodic behavior when traditional techniques do not. 
This technique is based on an information theoretic reduction 
of linear (autoregressive) models so that only the essential fea- 
tures of an autoregressive model are retained. These models 
we call reduced autoregressive models (RARM). The essential 
features of reduced autoregressive models include any period- 
icity present in the data. 

We provide theoretical and numerical evidence from both 
experimental and artificial data, to demonstrate that this 
technique will reliably detect periodicities if and only if they 
are present in the data. There are strong information theo- 
retic arguments to support the statement that RARM detects 
periodicities if they are present. Surrogate data techniques are 
used to ensure the converse. Furthermore, our calculations 
demonstrate that RARM is more robust, more accurate, and 
more sensitive, than traditional spectral techniques. 



I. INTRODUCTION 

Periodic, and nearly periodic, behavior is a common feature of many biological and physical systems and there exist several widely-used techniques to estimate the period of a behavior, for example, spectral estimation [?], autocorrelation [?], spectrographs, band pass (comb) filters [?] and wavelet transforms [?]. All of these standard techniques either employ, or are related to, or are a generalization of, Fourier series.

In this paper we propose an alternative method of de- 
tecting periodicity that is not so closely related to Fourier 
series. This new technique applies ideas from information 
theory to linear autoregressive models of time series to 
extract evidence of periods. 

The basic principle is the following. Given a time series $\{y_t\}_{t=1}^{N}$ one can propose a linear autoregressive model AR(n) by

$$ y_t = a_1 y_{t-1} + a_2 y_{t-2} + a_3 y_{t-3} + \cdots + a_n y_{t-n} + e_t, \qquad t = n+1, n+2, \ldots, N, \tag{1} $$

where $e_t$ are assumed to be independent and identically distributed random variables, which are interpreted as the modeling errors [1, 3]. Under these assumptions the maximum likelihood estimates of the parameters $a_1, a_2, \ldots, a_n$ can be written in terms of a covariance function, and are therefore related to the autocorrelation function and Fourier spectrum. It is common practice to determine the optimal size $n$ of the model by using either the Akaike [?] or the Schwarz [?] information criteria; this is done to avoid over-fitting of the time series [?]. It has recently been observed that a further optimization of an AR(n) model may be possible by deleting some of the terms to obtain a model

$$ y_t = a_0 + a_1 y_{t-\ell_1} + a_2 y_{t-\ell_2} + a_3 y_{t-\ell_3} + \cdots + a_k y_{t-\ell_k} + e_t, \tag{2} $$

where

$$ 1 \le \ell_1 < \ell_2 < \ell_3 < \cdots < \ell_k \le n $$

for $\ell_i \in \mathbb{Z}^{+}$, $i = 1, 2, 3, \ldots, k$. The hope is to obtain
a model that fits the time series equally well, but has 
far fewer parameters. Profound theoretical arguments, 
which are a codification of Occam's razor, imply that if 
a reduced autoregressive model (RARM) is suitably opti- 
mized, then it is superior to an equivalent autoregressive 
model AR(n). The key principle of this paper is that if one has an optimized RARM, that is, the RARM that has been reduced to only the essential terms, then the parameters $\ell_1, \ell_2, \ell_3, \ldots, \ell_k$, often called lags, provide information about the periodicity of the time series.
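
To make the role of the lags concrete, the following minimal Python sketch evaluates the one-step predictions of a reduced model of the form (2) for given coefficients and lags. The function name `rarm_predict` and the toy period-6 series are illustrative assumptions, not part of the original method.

```python
def rarm_predict(y, a0, coeffs, lags):
    """One-step predictions a0 + sum_i coeffs[i] * y[t - lags[i]].

    Predictions start at t = max(lags), the first index where every lagged value exists.
    """
    if len(coeffs) != len(lags):
        raise ValueError("need exactly one coefficient per lag")
    start = max(lags) if lags else 0
    predictions = []
    for t in range(start, len(y)):
        predictions.append(a0 + sum(a * y[t - lag] for a, lag in zip(coeffs, lags)))
    return predictions


# Toy series that repeats exactly every 6 samples, so y_t = y_{t-6}.
pattern = [2.0, 3.0, 5.0, 4.0, 7.0, 6.0]
y = pattern * 5

# A reduced model with the single lag 6 and coefficient 1 predicts it without error.
preds = rarm_predict(y, a0=0.0, coeffs=[1.0], lags=[6])
errors = [obs - p for obs, p in zip(y[6:], preds)]
print(max(abs(e) for e in errors))  # 0.0
```

In this toy case the only lag needed is the period itself, which is the sense in which the lags of a reduced model carry information about periodicity.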

A practical procedure for obtaining an optimal reduced autoregressive model (RARM) has been described by Judd and Mees [9]. This procedure was introduced in
the more general context of nonlinear modeling, but in 
the following section we describe briefly the underlying 
theory in the context of RARM. 

The major part of this paper is aimed at presenting 
evidence that examining the lags of an optimal RARM 
provides a more robust and accurate means of detecting periods in time series than traditional spectral techniques. That is, the proposed technique unambiguously
identifies periodicities even when spectral methods fail 
to do so, and furthermore, does not falsely suggest the 
presence of periods when none are present. The evidence 
presented is a combination of theoretical argument and 
numerical procedures, which are illustrated with both ar- 
tificial and experimental data. 

An important numerical procedure that will be used 
to establish that the proposed technique does not falsely 
identify periods is surrogate data analysis. The principle of surrogate data analysis is the following. From
experimental data one generates artificial data that are 
"similar" to the experimental data and satisfy a given 
hypothesis. One then calculates a test statistic for each 
surrogate data set, and hence obtains an ensemble of 
statistic values that estimate the distribution of the test 
statistic under the assumption that the original data is 
consistent with the given hypothesis. One then compares 
the statistic value of the original data with the estimated 
distribution of the surrogates. If the data has an atypical 
statistic value then the hypothesis will be rejected, other- 
wise it should be accepted. In this paper we employ this 
technique to ensure that RARM procedures do not spu- 
riously identify periodicities in temporally uncorrelated 
surrogate data. 
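
The surrogate principle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the test statistic here is the lag-1 autocorrelation (a stand-in, not the RARM statistic used in this paper), surrogates are produced by shuffling, and the names `surrogate_test`, `shuffled` and `lag1_autocorrelation` are hypothetical.

```python
import math
import random


def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1 (stand-in test statistic)."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def shuffled(data, rng):
    """Surrogate consistent with 'no temporal correlation': a random reordering."""
    copy = list(data)
    rng.shuffle(copy)
    return copy


def surrogate_test(data, make_surrogate, statistic, n_surrogates=100, seed=0):
    """Return the data statistic, the surrogate ensemble, and how many
    surrogates are at least as extreme (in absolute value) as the data."""
    rng = random.Random(seed)
    observed = statistic(data)
    ensemble = [statistic(make_surrogate(data, rng)) for _ in range(n_surrogates)]
    n_extreme = sum(1 for s in ensemble if abs(s) >= abs(observed))
    return observed, ensemble, n_extreme


# Strongly correlated toy data: a sinusoid with period 12.
data = [math.sin(2.0 * math.pi * t / 12.0) for t in range(120)]
observed, ensemble, n_extreme = surrogate_test(data, shuffled, lag1_autocorrelation)
print(observed, n_extreme)
```

If few or none of the surrogates reach the statistic value of the data, the data are atypical under the hypothesis and the hypothesis is rejected.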

A. Minimum description length 

The criterion we use for determining the optimal RARM is the minimum description length. Occam's razor recommends that the best description of a phenomenon is the shortest description. This principle can be made rigorous using information theory, and the principle was independently developed by Wallace [10] and Rissanen [11].

Operationally the principle is applied as follows. Suppose you have a time series $\{y_t\}_{t=1}^{N}$ given to a certain fixed accuracy and that you wish to communicate the data to a colleague. To send the raw data would require a certain number of bits. Alternatively, one could build a predictive model, of the form (2) for example, and then send the model parameters (to some precision), the initial $\ell_k$ observations, and the differences between the model's predictions and actual observations. Given
this information your colleague can reconstruct the orig- 
inal data. If the model of the time series is good, then 
the total number of bits required for parameters, initial 
conditions and prediction errors is less than the number 
of bits of raw data, because the differences between the 
predicted and actual observations are smaller than the 
observations. The total number of bits sent in the sec- 
ond case is called the description length, and the model 
that achieves the minimum description length is the one 
recommended by the application of Occam's razor. The 
dogma is that this model achieves the best prediction of 
the data without over-fitting. 

In practice it is usually sufficient to estimate the de- 
scription length of a model, rather than calculate it in 
detail. An estimate will usually have the form 

$$ (\text{description length}) \approx (\text{number of data}) \times \log(\text{sum of squares of prediction errors}) + (\text{penalty for number and accuracy of parameters}). $$
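
As a rough numerical illustration of this trade-off, the sketch below evaluates an estimate of the above form. The penalty term used here, (number of parameters) x log(number of data), is a Schwarz-type stand-in chosen only for illustration; it is an assumption and not the penalty derived by Judd and Mees. The function name and the numbers are hypothetical.

```python
import math


def approximate_description_length(n_data, sum_sq_errors, n_params):
    """(number of data) * log(sum of squared errors) + an assumed parameter penalty."""
    fit_cost = n_data * math.log(sum_sq_errors)
    penalty = n_params * math.log(n_data)  # assumed Schwarz-type penalty
    return fit_cost + penalty


# Hypothetical comparison: eight extra parameters that barely reduce the error.
n = 500
small_model = approximate_description_length(n, sum_sq_errors=260.0, n_params=2)
large_model = approximate_description_length(n, sum_sq_errors=258.0, n_params=10)
print(small_model < large_model)  # True
```

With these numbers the smaller model has the shorter description, because the slight improvement in fit does not pay for the additional parameters.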

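Following Judd and Mees [9] the description length of a RARM can be estimated as follows. Given a time series $\{y_t\}_{t=1}^{N}$ define the constant vector $V_0$ and a set of vectors $\{V_i\}_{i=1}^{n}$ by

$$
\begin{aligned}
V_0 &= (1, 1, \ldots, 1)^T, \\
V_1 &= (y_n, \ldots, y_{N-1})^T, \\
V_2 &= (y_{n-1}, \ldots, y_{N-2})^T, \\
    &\;\;\vdots \\
V_j &= (y_{n-j+1}, \ldots, y_{N-j})^T, \\
    &\;\;\vdots \\
V_n &= (y_1, \ldots, y_{N-n})^T,
\end{aligned}
$$

and define

$$ y = (y_{n+1}, \ldots, y_N)^T. $$

Observe that if the model (2) is appropriate for the time series one can write

$$ y = \sum_{i=1}^{k} a_i V_{\ell_i} + e = V_B a_B + e, \tag{3} $$

where $B = (\ell_1, \ldots, \ell_k)$, $V_B = [V_{\ell_1} | V_{\ell_2} | \cdots | V_{\ell_k}]$ is an $(N-n) \times k$ matrix, and $a_B = (a_1, a_2, \ldots, a_k)^T$. The maximum likelihood estimates of $a_B$, that is, the values that minimize $e^T e$, are given by the least squares solution

$$ \hat{a}_B = (V_B^T V_B)^{-1} V_B^T y. $$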

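Now each parameter $a_j$ must be sent to some precision $\delta_j$, that is, the maximum likelihood estimate of $a_j$ is "rounded-off" by an amount $\delta_j$. It can be shown [9] that the optimal precisions $\delta = (\delta_1, \delta_2, \ldots, \delta_k)$, that is, the round-off for each $a_j$ that gives the minimum description length, satisfy

$$ (Q\delta)_j = 1/\delta_j, $$

where, with $\hat{a}_B$ denoting the maximum likelihood estimate of $a_B$,

$$ Q = \frac{N\, V_B^T V_B}{(V_B \hat{a}_B - y)^T (V_B \hat{a}_B - y)}. $$

Consequently, it can be shown [9] that the approximate description length of the RARM (2) is

$$ \frac{N}{2}\left(1 + \ln 2\pi\sigma^2\right) + \left(\frac{1}{2} + \ln\gamma\right) k - \sum_{j=1}^{k} \ln\delta_j, \tag{4} $$

where $\sigma^2 = e^T e / N$ is the mean squared prediction error and $\gamma$ is a constant depending on the overall scale of the data.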
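
As a hedged illustration of the precision condition, consider the simplest case of a single-lag model ($k = 1$). Assuming $Q$ is the positive scalar obtained when $V_B$ reduces to one column $V_\ell$ with residual $e = y - V_\ell \hat{a}$, the condition can be solved in closed form:

$$ Q = \frac{N\, V_\ell^T V_\ell}{e^T e}, \qquad Q\,\delta_1 = \frac{1}{\delta_1} \;\Longrightarrow\; \delta_1 = Q^{-1/2} = \sqrt{\frac{e^T e}{N\, V_\ell^T V_\ell}}. $$

In this special case a smaller residual gives a smaller $\delta_1$, that is, a better-fitting model justifies spending more bits on its parameter.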

Armed with this estimate of the description length of a RARM one can search over all combinations of lags $B = (\ell_1, \ell_2, \ldots, \ell_k)$ to obtain the optimal RARM; however, Judd and Mees [9] describe a fast and efficient method of doing this optimization.

II. DETECTING PERIODICITY USING OPTIMAL RARM

A function $f$ is periodic with period $\tau$ if $f(t) = f(t + \tau)$ for all $t$. A time series (assumed stationary) has an (approximate) periodicity of period $\tau$ if $y_t \approx y_{t+\tau}$ for all $t$, or, equivalently, the autocorrelation $\rho$ has a local maximum at $\tau$. The reduced autoregressive model (2) predicts the current value of a time series $y_t$ as a weighted average of the previous values, that is, at the time steps $\ell_1, \ell_2, \ldots,$ and $\ell_k$ previous to $t$. If a time series has periodic behavior, then the lags $\ell_1, \ell_2, \ldots, \ell_k$ should be (multiples of) the periods.

We claim that one can detect in time series a periodicity of period $\le n_{\max}$ by the following procedure, called the RARM procedure. For $n = 0, 1, 2, 3, \ldots, n_{\max}$ build optimal reduced autoregressive models of the form (2) using the algorithm described by Judd and Mees [9]. For each model in this sequence calculate its description length (4) and take as the overall optimal model that model with the smallest description length. We claim that if the overall optimal RARM is non-trivial, then the lags $\ell_1, \ell_2, \ldots, \ell_k$ should be (multiples of) the periods $\le n_{\max}$ in the original time series if the time series is sufficiently long.
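
The outer loop of this procedure can be sketched as follows. This is only a skeleton under stated assumptions: the step that builds the optimal model for a given maximum lag is passed in as a function (`build_optimal_model`, a hypothetical name), because the Judd and Mees algorithm is not reproduced here, and the toy builder below returns made-up description lengths purely to exercise the loop.

```python
def rarm_procedure(y, n_max, build_optimal_model):
    """Try every maximum lag n = 0..n_max and keep the shortest description.

    build_optimal_model(y, n) must return (lags, description_length).
    """
    best = None
    for n in range(n_max + 1):
        lags, length = build_optimal_model(y, n)
        if best is None or length < best[1]:
            best = (sorted(lags), length)
    return best


def is_nontrivial(lags):
    """A model is non-trivial when it retains at least one lag."""
    return len(lags) > 0


# Hypothetical stand-in builder with invented description lengths.
def toy_builder(y, n):
    if n >= 6:
        return [1, 6], 90.0 + 0.1 * n
    if n >= 1:
        return [1], 100.0 + 0.1 * n
    return [], 120.0


lags, length = rarm_procedure([0.0] * 50, n_max=10, build_optimal_model=toy_builder)
print(lags, is_nontrivial(lags))  # [1, 6] True
```

The lags of the returned model are then read off as candidate periods, and an empty list of lags corresponds to the trivial model.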

In order to establish our claim we must demonstrate 
that 

i. if the time series contains a period then the RARM procedure detects this periodic behavior, and

ii. if the RARM procedure detects a period then there is periodic behavior in the time series.

In section II A we provide a theoretical argument to establish the forward implication (i). In section II B we discuss an essential procedure for ensuring (ii).



A. Forward implication (i)

The argument to establish the forward implication proceeds as follows. First, we observe that a period in a time series will (regardless of whether it is linear or nonlinear) produce a local maximum in the autocorrelation function $\rho(\tau)$. Next it is shown below that, in the optimization of a RARM of given maximum size $n$, the criterion for inclusion of a particular term $a_j y_{t-\ell_j}$ in (2) is closely related to the magnitude of the autocorrelation at $\ell_j$, $\rho(\ell_j)$. Hence, if $n$ is large enough, the optimal RARM will include a term corresponding to this periodicity. Rissanen's minimum description length criterion guarantees that this will always be the case provided the time series is sufficiently long, and so the RARM procedure will always detect periods that are present in such a time series.

The remainder of this section elaborates on the detail of this argument. A period $\tau$ in a time series $\{y_t\}_{t=1}^{N}$ of $N$ scalar measurements is a strong positive correlation between values separated by $\tau$ time steps, i.e. the autocorrelation

$$ \rho(\tau) = \frac{(y - \bar{y})^T (V_\tau - \bar{y})}{\sum_{t=n+1}^{N} (y_t - \bar{y})^2} \tag{5} $$

has a local maximum at $\tau$. Without loss of generality we may assume that $\bar{y} = 0$, and therefore (5) reduces to

$$ \rho(\tau) = \frac{V_\tau^T y}{y^T y}. \tag{6} $$

Let the set of lags for the optimal RARM of size $k$ be denoted by $B_k = \{\ell_1^{(k)}, \ell_2^{(k)}, \ldots, \ell_k^{(k)}\}$. The vector $B_k$ uniquely determines the least squares model

$$ y = \sum_{i=1}^{k} a_i^{(k)} V_{\ell_i^{(k)}} + e. $$

Define

$$ L(\tau) = \frac{V_\tau^T y - \sum_{i=1}^{k} a_i^{(k)} V_\tau^T V_{\ell_i^{(k)}}}{y^T y}. \tag{7} $$

According to the algorithm of Judd and Mees [9], given $B_k$ and $a_B^{(k)}$, the next best term to add to the model has the lag $\tau$ that maximizes $L(\tau)$. However, identity (6) implies that such a $\tau$ is a local maximum of $\rho(\tau)$.
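
The selection step can be illustrated with a greedy forward-selection sketch in Python. Several parts are assumptions made only for illustration: this is not the Judd and Mees algorithm itself, the stopping rule uses a Schwarz-type stand-in for the description length rather than (4), a small floor keeps the logarithm finite on noise-free toy data, no constant term is fitted, and all function names are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_lags(target, columns, lags):
    """Least squares coefficients and residual for the chosen lags."""
    if not lags:
        return [], list(target)
    A = [[dot(columns[p], columns[q]) for q in lags] for p in lags]
    rhs = [dot(columns[p], target) for p in lags]
    coeffs = solve_linear(A, rhs)
    residual = [t - sum(a * columns[lag][i] for a, lag in zip(coeffs, lags))
                for i, t in enumerate(target)]
    return coeffs, residual


def stand_in_length(residual, k):
    """Assumed Schwarz-type stand-in for the description length."""
    m = len(residual)
    mse = max(dot(residual, residual) / m, 1e-12)  # floor keeps the log finite
    return 0.5 * m * math.log(mse) + k * math.log(m)


def greedy_lags(y, n):
    """Add, one at a time, the lag tau maximizing L(tau); stop when the
    stand-in description length no longer decreases."""
    target = y[n:]
    columns = {lag: y[n - lag:len(y) - lag] for lag in range(1, n + 1)}
    yty = dot(target, target)
    lags, residual = [], list(target)
    best = stand_in_length(residual, 0)
    while len(lags) < n:
        candidates = [lag for lag in columns if lag not in lags]
        # L(tau) = V_tau^T (y - sum_i a_i V_{l_i}) / (y^T y), as in (7)
        new_lag = max(candidates, key=lambda lag: dot(columns[lag], residual) / yty)
        _, trial_residual = fit_lags(target, columns, lags + [new_lag])
        trial = stand_in_length(trial_residual, len(lags) + 1)
        if trial >= best:
            break
        lags, residual, best = lags + [new_lag], trial_residual, trial
    return sorted(lags)


pattern = [-2.5, -1.5, 0.5, -0.5, 2.5, 1.5]  # mean-zero toy pattern with period 6
print(greedy_lags(pattern * 20, n=8))  # [6]
```

For the empty model $L(\tau)$ coincides with $\rho(\tau)$ in (6), so the first lag selected on this toy series is the one at which the autocorrelation is largest, namely the period.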

Rissanen's minimum description length ensures that, for sufficiently large $N$, "if there is any machinery behind the data, which restricts the future observation in a similar manner as the past and which can be captured by the selected class of parametric functions, then we will find that machinery" [11]. The argument in the preceding paragraphs demonstrates that RARM are a sufficiently broad class of parametric functions to capture "machinery" behind the data, including observed periodicities. Thus, if periodicity is present in the data then RARM techniques will detect it, provided $N$ is sufficiently large. This ensures the forward implication (i).



B. Reverse implication (ii): Surrogate data techniques

In order to establish that the RARM technique does not falsely identify a period when none is present, the numerical procedure of surrogate data analysis can be used. The technique of surrogate data was originally introduced by Theiler and colleagues [12]. They suggest three surrogate generation techniques to address three different hypotheses about a time series, but for our purposes we only use Theiler's algorithm 0 surrogates.
In the present case we are interested in whether a time series contains periodicities, or said in another way, we wish to test the null hypothesis that the time series contains no periodicities, that is, has no temporal correlation. Theiler's algorithm 0 generates surrogate time series having no temporal correlation by simply shuffling the original time series, or put another way, the surrogates are i.i.d. noise having the same rank distribution as the original time series [?].

Our proposal is to use optimal RARM as the test for periodicity, that is, if the optimal RARM is non-trivial in that $k > 0$ in (2), then periods are present in the time series. To believe the validity of this test one must require that if the optimal RARM detects a period in a time series, then it must not detect any period in algorithm 0 surrogates [13], [14]. This surrogate test must be applied to each data set for which an optimal RARM has been constructed to ensure that the structure detected in each data set is genuine. That is, we propose that an algorithm 0 surrogate test is a necessary part of the procedure of detecting periodicity using an optimal RARM. If RARM methods identify periodicity in the surrogates then this is clear evidence of false identification of periodicity in the data. However, if the RARM algorithm detects no periodicity in the surrogates then periodicity identified in the original data is genuine. To ensure the reverse implication (ii) holds one need only apply an algorithm 0 surrogate calculation.



III. CALCULATIONS 

In this section we demonstrate with artificial and experimental data that RARM detects periodic behavior (i) if and (ii) only if it is present in the original time series. To demonstrate that RARM detects periodic behavior if it is present in the data we construct artificial data contaminated with noise and demonstrate the effectiveness of the RARM algorithm. We compare the RARM results to traditional Fourier spectral and autocorrelation techniques. We repeat these calculations for some experimental data comparing the RARM algorithm and traditional techniques. To demonstrate that our RARM algorithm detects periodic behavior only if it is present in the data we apply the method of surrogate data.
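
The construction of noise-contaminated artificial data can be sketched as follows. This is a minimal illustration, not the code used for the calculations: the coefficient values are those quoted for the fitted infant-respiration model later in this section, while the burn-in length, the start-up values, the random seed and the function names are assumptions.

```python
import random


def simulate_model(n_points, a0, a1, a2, dynamic_sd, rng, burn_in=200):
    """Iterate y_t = a0 + a1*y_{t-1} + a2*y_{t-6} + e_t with Gaussian e_t."""
    fixed_point = a0 / (1.0 - a1 - a2)
    y = [fixed_point] * 6  # six start-up values for the lag-6 term (assumed)
    for _ in range(burn_in + n_points):
        y.append(a0 + a1 * y[-1] + a2 * y[-6] + rng.gauss(0.0, dynamic_sd))
    return y[-n_points:]  # discard start-up values and burn-in


def add_observational_noise(y, noise_sd, rng):
    """Return z_t = y_t + eps_t with independent Gaussian eps_t."""
    return [value + rng.gauss(0.0, noise_sd) for value in y]


rng = random.Random(0)
y = simulate_model(762, 2.945206, 0.300739, 0.202056, dynamic_sd=1.0, rng=rng)
z = add_observational_noise(y, noise_sd=1.0, rng=rng)
print(len(z))  # 762
```

The series `z` would then be analysed in the same way as experimental data, with the known lag 6 serving as the ground truth.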

In section III A we describe the application of these techniques to detect periodicities in recordings of infant respiratory patterns during natural sleep. Section III B applies these methods to artificial data sets to demonstrate the effectiveness of these techniques compared to traditional methods. Section III C describes the application of these same methods to global climatic data.

A. Infant respiratory data

Using inductance plethysmography we have collected measurements of cross-sectional area of the abdomen of infants during natural sleep. From these measurements we extract a measure that can be related to the breath volume [15]. Figure 1 gives an example of data collected in this way.

We applied our RARM procedure to the data illustrated in figure 1 and obtained a model of the form

$$ y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-6} + e_t \tag{8} $$

where $a_0 \approx 2.945206$, $a_1 \approx 0.300739$ and $a_2 \approx 0.202056$.
Figure 2 shows the result of analysis of this data set with a fast Fourier transform algorithm (MATLAB's spectrum command) and an estimate of the autocorrelation function. Both these techniques yield small peaks at the same value (that is, 6) and are consistent with the results of our RARM algorithm. However, the results are not as unambiguous as the results of the RARM algorithm. That is, the RARM detects a periodicity that is not strong enough to be unambiguously identified by spectral methods.

For many time series of breath size [16] we have computed autocorrelation and Fourier spectral estimates. We
have applied our RARM algorithm to each data set and 
compared this to the result of applying traditional tech- 
niques. For these data the period of periodic behavior 
detected by the RARM algorithm is consistent with the 
periods detected by autocorrelation. That is, if RARM 
detects periodic behavior, then it is of the same period 
as that detected by the autocorrelation estimate (if the 
autocorrelation detects periodic behavior). Furthermore, 
if RARM does not detect periodic behavior, then neither 
does the autocorrelation estimate. The traditional tech- 
niques will often fail to detect periodic behavior when 
the RARM algorithm does detect it. 

We have provided experimental evidence that the 
RARM technique detects periodic behavior when it does 
occur. Now we will demonstrate that the RARM tech- 
nique does not lead to spurious identification of periodic 
behavior. That is, we will show that if the RARM al- 
gorithm detects periodic behavior, then there is periodic 
behavior in the data. To do this we apply a surrogate 
data algorithm which will ensure that false indications of 
periodicities can always be identified. 

For the data illustrated in figure 1, none of 100 surrogates generated by shuffling the data exhibited periodic behavior of any period. This calculation was repeated with another 48 data sets [?]. In all 49 cases the RARM failed to detect periodic behavior in the surrogate data in at least 99 (of 100) surrogates of each data set. This indicates that the RARM algorithm does not identify periodicities not present in the data.

B. Artificial data

In this section we use the optimal RARM from section III A as a basis for generating noisy artificial data with a known periodicity. From (8) we use the model

$$ y_t = a_0 + a_1 y_{t-1} + a_2 y_{t-6} + e_t \tag{9} $$

(where $a_0 \approx 2.945206$, $a_1 \approx 0.300739$ and $a_2 \approx 0.202056$, as above) to generate an artificial data set $y$. To this data we add observational noise $\epsilon_t$ and apply the above analysis to the series $z$, $z_t = y_t + \epsilon_t$. Figure 3 demonstrates the result of this technique for an artificial data set of the same length as the data and normal observational noise with standard deviation 1 ($e_t, \epsilon_t \sim N(0,1)$). Figure 4 is the result of the same technique for a longer data set (5000 data points) and more observational noise ($e_t \sim N(0,1)$ and $\epsilon_t \sim N(0,2)$). In both cases RARM clearly identified periodic behavior with period 6. For the time series shown in figures 3 and 4 we constructed 100 algorithm 0 surrogates; none of them exhibited periodicity detected by RARM.

The traditional Fourier spectral and autocorrelation techniques identify the same period as the RARM technique for the shorter, but less noisy data illustrated in figure 3. However, for the data shown in figure 4 the RARM technique has identified periodicities that are not obvious from traditional techniques. Furthermore, it should be noted that in all cases the results of the autocorrelation and spectral methods are not clear cut. For reasonably long, but extremely noisy data sets the RARM algorithm still provides a decisive and accurate estimate of the period of periodic behavior present in data.

C. Global climatic data

In this section we describe the application of these techniques to noisy physical data. The time series we use here is monthly deviations from monthly mean global air temperatures over the period 1856-1997 [17]. These global air temperature measurements are obtained by averaging observations at many spatially separated sites on the globe. Figure 5 shows the complete data set. A more detailed discussion of this data may be found in [18]. Analysis using the methods described in this paper demonstrates the presence of periodic fluctuation over periods of 7 months, 2 years and 45 months [?]. Fourier spectral and autocorrelation estimates were also applied (after de-trending this time series) and the results are illustrated in figure 6. From 100 algorithm 0 surrogates RARM did not detect periodicity in 99 of them. These results demonstrate the presence of genuine periodic fluctuation in this time series and that the fluctuation is difficult to detect with traditional techniques. An advantage of the RARM technique is that no de-trending is required. The results of the RARM algorithm are not affected by trends or non-stationarity.

IV. CONCLUSIONS

We have provided theoretical and experimental evi- 
dence to support the use of RARM techniques to detect 
periodic behavior in noisy experimental time series. The 
concept of minimum description length ensures that a 
RARM built with an MDL modeling criterion will detect 
any periodicities present in the data. We provided nu- 
merical evidence using experimental and artificial data to 
support this. Moreover these calculations have demon- 
strated that the RARM algorithm provides an accurate 
and decisive method of detecting periodicities that is 
more sensitive than Fourier spectrum or autocorrelation 
methods. 

By applying surrogate data techniques we have demon- 
strated that the RARM algorithm did not identify peri- 
odicities in temporally uncorrelated surrogates. This is 
strong experimental evidence that the RARM algorithm 
is robust against identification of false periodicities. It 
does not identify behavior not present in the original system. However, this result has only been supported by numerical evidence and does not imply true identification with arbitrary data. To guard against false positives we recommend application of surrogate data tests, as discussed in this paper. Periodicities detected using RARM are genuine provided RARM detects no periodicity in i.i.d. surrogates.

FIG. 1. Tidal volume: The horizontal axis is breath number; each datum in this time series corresponds to a single
breath. The vertical axis is derived from the output from the analogue to digital converter (proportional to cross-sectional 
area measured by inductance plethysmography, arbitrary units). For each breath the minimum and maximum value over that 
breath were calculated and the difference recorded. This data set consists of 762 points recorded from a 21 week old male during 
24 minutes of continuous stage 2 sleep. This study had approval from the ethics committee of Princess Margaret Hospital. 
The parents of this subject were informed of the procedure, and its purpose, and had given consent. The recording took place 
during a scheduled overnight sleep study at Princess Margaret Hospital. 

FIG. 2. Spectral techniques: Estimates of the power spectrum (arbitrary units) against frequency (breath$^{-1}$) and of the autocorrelation function against period (lag $\tau$) for the data illustrated in figure 1. The RARM detected periodic motion over a period of 6 data points, see equation (8). A vertical dot-dashed line marks the location of period 6 behavior in both the frequency (power spectrum) and time (autocorrelation) domain. A peak in the autocorrelation function corresponds exactly with the period 6 behavior detected by RARM. The power spectrum has a peak close to a frequency of $6^{-1} \approx 0.166667$. A period of 6 is the closest integer value to the peak evident at this location in the power spectrum. Whilst both power spectra and autocorrelation detect behavior with a period of 6, these results are not as conclusive as the RARM algorithm.

FIG. 3. Artificial data: A data set of 764 realizations of the process described by (9) with normal observational noise, standard deviation 1. This linear model is of the same form as that predicted from the model of the data in figure 1. Also shown are the power spectrum (arbitrary units) and autocorrelation estimate for this data set. For this data set RARM gave a clear indication of period 6 behavior. The dot-dashed line on the power spectrum and autocorrelation function corresponds to the period of 6 detected by RARM.

FIG. 4. Artificial data: Data from a reduced autoregressive model of the same form as that predicted from the model of the data in figure 1. This data set consists of 5000 realizations of (9) with observational noise, standard deviation 2. Also shown are the power spectrum (arbitrary units) and autocorrelation estimate for this data set. For this data set RARM gave a clear indication of period 6 behavior. The dot-dashed line on the power spectrum and autocorrelation function corresponds to the period of 6 detected by RARM.

FIG. 5. Global air temperature: Monthly global air temperature measured as deviation (in degrees Celsius) from monthly mean temperature for the period 1856-1997 (1704 data).

FIG. 6. Spectral techniques: Estimates of the power spectrum and autocorrelation function for the data illustrated in figure 5. The data in figure 5 were linearly de-trended before calculating Fourier spectrum and autocorrelation. The RARM detected periodic motion over periods of 7, 24 and 45 months. A vertical dot-dashed line marks the location of period 7, 24 and 45 behavior in both the frequency and time domain. A peak in the autocorrelation function corresponds exactly with the period 24 and 45 behavior detected by RARM. The power spectrum has a peak close to a frequency of $45^{-1} \approx 0.0222$. Whilst both power spectra and autocorrelation detect behavior with a period of 24 and 45, these results are not as conclusive as the RARM algorithm.